Several novel evaluation measures for rank-based ensemble pruning with applications to time series prediction

Authors

  • Zhongchen Ma
  • Qun Dai
  • Ningzhong Liu
Abstract

Keywords: Ensemble pruning; Time series prediction; Rank-based ensemble pruning; Complementarity measure for time series prediction (ComTSP); Concurrency thinning for time series prediction (ConTSP); Reduce Error pruning for time series prediction (ReTSP-Trend); Time window size

Ensemble pruning is a popular way to reduce the high computational cost of traditional ensemble learning techniques. Among the various ensemble pruning methods, rank-based pruning is conceptually the simplest and offers a performance advantage. This paper proposes four evaluation measures for rank-based ensemble pruning designed specifically for time series prediction. The first, the Complementarity measure for time series prediction (ComTSP), is adapted from the Complementarity measure (COM) for classification. Its design idea is that if the error made by the subensemble on a pruning sample exceeds the error made by the candidate predictor by a certain margin, the predictor is assumed to be complementary to the subensemble. At each selection step, the predictor that minimizes the error rate of the subensemble on the pruning set is selected. The second, Concurrency thinning for time series prediction (ConTSP), is adapted from the Concurrency measure (CON) for classification. With ConTSP, a predictor is rewarded for performing well, and rewarded more for performing well when the subensemble performs badly; it is penalized when both the subensemble and the predictor itself perform poorly. The third measure, ReTSP-Value, is designed for Reduce Error (RE) pruning in time series prediction. However, ReTSP-Value and ComTSP share a flaw: they cannot guarantee that the remaining predictor that supplements the subensemble the most will be selected. The cause is that the predictive error in time series prediction is directional, so it is not reasonable for these measures to take error reduction as the only goal while ignoring the error direction. The final proposed measure, ReTSP-Trend, overcomes this defect by taking into account the trend of the time series and the direction of the forecasting error, which guarantees that the remaining predictor that supplements the subensemble the most is selected. Comparison experiments on four benchmark financial time series datasets show that ReTSP-Trend outperforms the other measures and remarkably improves the predictive ability and generalization capability of the pruned ensembles for time series forecasting.

Time series can be defined as a set of …
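The abstract does not give the formulas for ComTSP, ConTSP, ReTSP-Value, or ReTSP-Trend, so the following is only a minimal sketch of the general idea of rank-based pruning that it describes: predictors are ordered greedily, and at each step the candidate that gives the lowest subensemble error on the pruning set is added. The sketch assumes an ensemble that averages its members' forecasts and uses RMSE as the error; the function names and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def rmse(pred, target):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def ensemble_mean(predictions, members):
    """Average the forecasts of the chosen members, point by point."""
    n = len(predictions[members[0]])
    return [sum(predictions[m][i] for m in members) / len(members)
            for i in range(n)]


def greedy_reduce_error_order(predictions, target):
    """Order predictors by greedy forward selection on the pruning set.

    Assumption: the subensemble output is the plain average of its members.
    At each step the candidate whose addition yields the lowest subensemble
    RMSE is appended to the ordering.
    """
    remaining = list(range(len(predictions)))
    selected = []
    while remaining:
        best = min(
            remaining,
            key=lambda k: rmse(ensemble_mean(predictions, selected + [k]), target),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy pruning set (hypothetical values, for illustration only).
target = [1.0, 2.0, 3.0, 4.0]
predictions = [
    [1.2, 2.2, 3.2, 4.2],  # predictor 0: overshoots by 0.2
    [1.3, 2.3, 3.3, 4.3],  # predictor 1: overshoots by 0.3
    [0.7, 1.7, 2.7, 3.7],  # predictor 2: undershoots by 0.3
]

order = greedy_reduce_error_order(predictions, target)
pruned = order[:2]  # keep the first two predictors of the ordering
print(order, pruned)
```

In this toy data, predictors 1 and 2 have the same individual error magnitude but opposite error directions. Only predictor 2 offsets the overshoot of predictor 0 when the forecasts are averaged, which is one simple way to see why the direction of the forecasting error matters when choosing the predictor that best supplements a subensemble.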

Similar resources

Ensemble Kernel Learning Model for Prediction of Time Series Based on the Support Vector Regression and Meta Heuristic Search

In this paper, a method for predicting time series is presented. Time series prediction is a process that predicts future system values based on information obtained from past and present data points. Time series prediction models are widely used in various fields of engineering, economics, etc. The main purpose of using different models for time series prediction is to make the forecast with...

A Novel Fuzzy Based Method for Heart Rate Variability Prediction

In this paper, a novel technique based on a fuzzy method is presented for chaotic nonlinear time series prediction. A fuzzy approach combined with a gradient learning algorithm constitutes the main component of this method. The learning process in this method is similar to the conventional gradient descent learning process, except that the input patterns and parameters are stored in mem...

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analysis. Many research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

Presenting an Ensemble-Learning-Based Algorithm for Learning to Rank in Information Retrieval

Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank has been shown to be useful in many applications of information retrieval, natural language processing, and data mining. Learning to rank can be described by two systems: a learning system and a ranking system. The learning system takes training data as input and constructs a ranking ...

Ensemble-based Top-k Recommender System Considering Incomplete Data

Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering systems, used either to predict whether a user will prefer an item (prediction problem) or to identify a set of k items that will be of interest to the user (Top-k recommendation problem). Demanding sufficient ratings to make robust predictions and suggesting qualified recommendations are two si...

Journal:
  • Expert Syst. Appl.

Volume 42

Publication date: 2015